VSF-Med: A Vulnerability Scoring Framework for Medical Vision-Language Models

Sadanandan, Binesh, Behzadan, Vahid

arXiv.org Artificial Intelligence

Vision Language Models (VLMs) hold great promise for streamlining labour-intensive medical imaging workflows, yet systematic security evaluations in clinical settings remain scarce. We introduce VSF-Med, an end-to-end vulnerability-scoring framework for medical VLMs that unites three novel components: (i) a rich library of sophisticated text-prompt attack templates targeting emerging threat vectors; (ii) imperceptible visual perturbations calibrated by structural similarity (SSIM) thresholds to preserve clinical realism; and (iii) an eight-dimensional rubric evaluated by two independent judge LLMs, whose raw scores are consolidated via z-score normalization to yield a 0-32 composite risk metric. Built entirely on publicly available datasets and accompanied by open-source code, VSF-Med synthesizes over 30,000 adversarial variants from 5,000 radiology images and enables reproducible benchmarking of any medical VLM with a single command. Our consolidated analysis reports mean z-score shifts of 0.90σ for persistence-of-attack-effects, 0.74σ for prompt-injection effectiveness, and 0.63σ for safety-bypass success across state-of-the-art VLMs. Notably, Llama-3.2-11B-Vision-Instruct exhibits a peak vulnerability increase of 1.29σ for persistence-of-attack-effects, while GPT-4o shows increases of 0.69σ for that same vector and 0.28σ for prompt-injection attacks.
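The scoring step lends itself to a compact illustration. The Python sketch below shows one plausible way to consolidate two judges' eight-dimension rubric scores into the 0-32 composite via z-score normalization; the reference statistics, judge-averaging step, and rescaling constants are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def composite_risk(judge_a, judge_b, ref_mean, ref_std):
    """Consolidate two judges' 8-dimension rubric scores (0-4 each)
    into a single 0-32 composite risk metric via z-score normalization.

    judge_a, judge_b : arrays of shape (8,) with raw rubric scores.
    ref_mean, ref_std: per-dimension statistics from a reference pool,
                       used to z-normalize each judge's raw scores.
    """
    raw = np.stack([judge_a, judge_b])     # (2, 8)
    z = (raw - ref_mean) / ref_std         # z-normalize each dimension
    consolidated = z.mean(axis=0)          # average the two judges
    # Map z-scores back onto the 0-4 rubric scale (z = 0 lands at the
    # midpoint 2), then sum the eight dimensions for the 0-32 composite.
    rescaled = np.clip(2.0 + consolidated, 0.0, 4.0)
    return float(rescaled.sum())
```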


X-Guard: Multilingual Guard Agent for Content Moderation

Upadhayay, Bibek, Behzadan, Vahid

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have rapidly become integral to numerous applications in critical domains where reliability is paramount. Despite significant advances in safety frameworks and guardrails, current protective measures exhibit crucial vulnerabilities, particularly in multilingual contexts. Existing safety systems remain susceptible to adversarial attacks in low-resource languages and through code-switching techniques, primarily due to their English-centric design. Furthermore, the development of effective multilingual guardrails is constrained by the scarcity of diverse cross-lingual training data. Even recent solutions like Llama Guard-3, while offering multilingual support, lack transparency in their decision-making processes. We address these challenges by introducing the X-Guard agent, a transparent multilingual safety agent designed to provide content moderation across diverse linguistic contexts. X-Guard effectively defends against both conventional low-resource-language attacks and sophisticated code-switching attacks. Our approach includes: curating and enhancing multiple open-source safety datasets with explicit evaluation rationales; employing a jury-of-judges methodology to mitigate individual judge LLM provider biases; creating a comprehensive multilingual safety dataset spanning 132 languages with 5 million data points; and developing a two-stage architecture combining a custom-finetuned mBART-50 translation module with an X-Guard 3B evaluation model trained through supervised finetuning and GRPO training. Our empirical evaluations demonstrate X-Guard's effectiveness in detecting unsafe content across multiple languages while maintaining transparency throughout the safety evaluation process. Our work represents a significant advancement in creating robust, transparent, and linguistically inclusive safety systems for LLMs and their integrated systems.
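As a rough illustration of the two-stage architecture, the sketch below routes input text through the off-the-shelf mBART-50 many-to-many checkpoint (the paper fine-tunes its own translation module) and then hands the English text to a safety evaluator. The evaluator interface and its return format are placeholders standing in for the released X-Guard 3B model.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Stage 1: translate the (possibly code-switched) input into English.
translator = MBartForConditionalGeneration.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt")
tok = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt")

def to_english(text, src_lang="hi_IN"):
    tok.src_lang = src_lang
    batch = tok(text, return_tensors="pt")
    generated = translator.generate(
        **batch, forced_bos_token_id=tok.lang_code_to_id["en_XX"])
    return tok.batch_decode(generated, skip_special_tokens=True)[0]

# Stage 2: pass the English text to the safety evaluator. The evaluator
# here is hypothetical: the real 3B model and its label/rationale output
# format would come from the paper's released artifacts.
def moderate(text, evaluator):
    english = to_english(text)
    return evaluator(english)  # e.g. {"label": "unsafe", "rationale": ...}
```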


Cognitive Overload Attack: Prompt Injection for Long Context

Upadhayay, Bibek, Behzadan, Vahid, Karbasi, Amin

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated remarkable capabilities in performing tasks across various domains without needing explicit retraining. This capability, known as In-Context Learning (ICL), while impressive, exposes LLMs to a variety of adversarial prompts and jailbreaks that manipulate safety-trained LLMs into generating undesired or harmful output. In this paper, we propose a novel interpretation of ICL in LLMs through the lens of cognitive neuroscience, drawing parallels between learning in human cognition and ICL. We apply the principles of Cognitive Load Theory to LLMs and empirically validate that, similar to human cognition, LLMs also suffer from cognitive overload: a state where the demand on cognitive processing exceeds the available capacity of the model, leading to potential errors. Furthermore, we demonstrate how an attacker can exploit ICL to jailbreak LLMs through deliberately designed prompts that induce cognitive overload, thereby compromising their safety mechanisms. We empirically validate this threat model by crafting various cognitive overload prompts and show that advanced models such as GPT-4, Claude-3.5 Sonnet, Claude-3 OPUS, Llama-3-70B-Instruct, Gemini-1.0-Pro, and Gemini-1.5-Pro can be successfully jailbroken, with attack success rates of up to 99.99%. Our findings highlight critical vulnerabilities in LLMs and underscore the urgency of developing robust safeguards. We propose integrating insights from cognitive load theory into the design and evaluation of LLMs to better anticipate and mitigate the risks of adversarial attacks. By expanding our experiments to encompass a broader range of models and by highlighting vulnerabilities in LLMs' ICL, we aim to ensure the development of safer and more reliable AI systems.


Twins in rotational spectroscopy: Does a rotational spectrum uniquely identify a molecule?

Schwarting, Marcus, Seifert, Nathan A., Davis, Michael J., Blaiszik, Ben, Foster, Ian, Prozument, Kirill

arXiv.org Artificial Intelligence

Rotational spectroscopy is the most accurate method for determining structures of molecules in the gas phase. It is often assumed that a rotational spectrum is a unique "fingerprint" of a molecule. The availability of large molecular databases and the development of artificial intelligence methods for spectroscopy make the testing of this assumption timely. In this paper, we pose the determination of molecular structures from rotational spectra as an inverse problem. Within this framework, we adopt a funnel-based approach to search for molecular twins: two or more molecules that have similar rotational spectra but distinctly different molecular structures. We demonstrate that such twins exist within standard levels of computational accuracy by generating rotational constants for many molecules from several large molecular databases, indicating that the inverse problem is ill-posed. However, some twins can be distinguished by increasing the accuracy of the theoretical methods or by performing additional experiments.
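A minimal sketch of the coarse first stage of such a funnel search, assuming each molecule is summarized by its three rotational constants (A, B, C); the 2% relative tolerance is an illustrative choice, and a real pipeline would refine candidate pairs with higher-accuracy theory or additional experiments.

```python
import numpy as np

def find_twin_candidates(names, constants, rel_tol=0.02):
    """Coarse funnel stage for rotational-twin candidates.

    constants: (N, 3) array of rotational constants (A, B, C) in MHz.
    Two molecules whose constants all agree within rel_tol are flagged
    as twin candidates for the finer stages of the funnel.
    """
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            rel_diff = np.abs(constants[i] - constants[j]) / constants[i]
            if np.all(rel_diff < rel_tol):
                pairs.append((names[i], names[j]))
    return pairs
```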


An integrated framework for developing and evaluating an automated lecture style assessment system

Dimitriadou, Eleni, Lanitis, Andreas

arXiv.org Artificial Intelligence

The aim of the work presented in this paper is to develop and evaluate an integrated system that provides automated lecture style evaluation, allowing teachers to get instant feedback on the quality of their lecturing style. The proposed system aims to promote improvement of lecture quality, which could upgrade the overall student learning experience. The proposed application utilizes specific measurable biometric characteristics, such as facial expressions, body activity, speech rate and intonation, hand movement, and facial pose, extracted from a video showing the lecturer from the audience's point of view. Measurable biometric features extracted during a lecture are combined to provide teachers with a score reflecting lecture style quality, both at the frame level and as quality metrics for the whole lecture. The acceptance of the proposed lecture style evaluation system was assessed by chief education officers, teachers, and students with regard to its functionality, usefulness, and possible improvements. The results indicate that participants found the application novel and useful in providing automated feedback regarding lecture quality. Furthermore, the performance of the proposed system was compared with the performance of humans in the task of lecture style evaluation. Results indicate that the proposed system not only achieves performance similar to that of human observers, but in some cases outperforms them.
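The abstract does not specify how the biometric features are fused, so the sketch below shows one simple possibility: a weighted combination of normalized per-frame feature scores, aggregated over the whole lecture. The feature names and weights are assumptions for illustration, not the paper's actual scheme.

```python
import numpy as np

# Illustrative per-feature weights; the paper's weighting is unspecified.
WEIGHTS = {"facial_expression": 0.25, "body_activity": 0.20,
           "speech": 0.25, "hand_movement": 0.15, "facial_pose": 0.15}

def frame_score(features):
    """Combine normalized per-frame biometric scores (each in [0, 1])
    into a single frame-level lecture-style score."""
    return sum(WEIGHTS[k] * features[k] for k in WEIGHTS)

def lecture_score(per_frame_scores):
    """Aggregate frame-level scores into whole-lecture quality metrics."""
    s = np.asarray(per_frame_scores)
    return {"mean": s.mean(), "std": s.std(), "min": s.min()}
```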


TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes

Upadhayay, Bibek, Behzadan, Vahid

arXiv.org Artificial Intelligence

LLMs such as ChatGPT and PaLM can be utilized to train on a new language and revitalize low-resource languages. However, it is evidently very costly to pretrain or fine-tune LLMs to adopt new languages. Another challenge is the limitation of benchmark datasets and the metrics used to measure the performance of models in multilingual settings. This paper proposes cost-effective solutions to both of these challenges. We introduce the Multilingual Instruction-Tuning Dataset (MITS), which comprises translations of Alpaca-52K, Dolly-15K, and the Vicuna Benchmark in 132 languages. We also propose a new method called TaCo: Translation-Assisted Cross-Linguality, which makes use of translation in a chain-of-thought process to instruction-tune LLMs on new languages through a curriculum learning process. As a proof of concept, we experimented with the instruction-tuned Guanaco-33B model and performed further instruction tuning using the TaCo method in three low-resource languages and one high-resource language. Our results show that the TaCo method achieves a GPT-4 evaluation score of 82% for a low-resource language on the Vicuna Benchmark dataset, and doubles performance compared to instruction tuning alone. Our results show that TaCo is a promising method for creating multilingual LLMs, even for low-resource languages. We have released our datasets and model adapters, and encourage the research community to make use of these resources towards advancing work on multilingual LLMs.
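A minimal sketch of how a TaCo-style training example might be assembled, assuming the chain-of-thought interleaves translation and response steps: translate the instruction into English, respond in English, then translate the response back. The template strings and field names are illustrative, not the released dataset's exact format.

```python
def taco_example(instruction_tgt, instruction_en, response_en, response_tgt):
    """Assemble one TaCo-style instruction-tuning example in which the
    target output walks through translation, English reasoning, and
    back-translation as an explicit chain of thought."""
    chain = (
        f"Translation of the instruction into English: {instruction_en}\n"
        f"Response in English: {response_en}\n"
        f"Response in the original language: {response_tgt}"
    )
    return {"instruction": instruction_tgt, "output": chain}
```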


A Theory of Mind Approach as Test-Time Mitigation Against Emergent Adversarial Communication

Piazza, Nancirose, Behzadan, Vahid

arXiv.org Artificial Intelligence

Multi-Agent Systems (MAS) is the study of multi-agent interactions in a shared environment. Communication for cooperation is a fundamental construct for sharing information in partially observable environments. Cooperative Multi-Agent Reinforcement Learning (CoMARL) is a learning framework in which we learn agent policies either with cooperative mechanisms or policies that exhibit cooperative behavior. Explicitly, there are works on learning to communicate messages from CoMARL agents; however, non-cooperative agents, when capable of accessing a cooperative team's communication channel, have been shown to learn adversarial communication messages, sabotaging the cooperative team's performance, particularly when objectives depend on finite resources. To address this issue, we propose a technique that leverages local formulations of Theory-of-Mind (ToM) to distinguish exhibited cooperative behavior from non-cooperative behavior before accepting messages from any agent. We demonstrate the efficacy and feasibility of the proposed technique in empirical evaluations on a centralized training, decentralized execution (CTDE) CoMARL benchmark. Furthermore, while we propose our explicit ToM defense for test time, we emphasize that ToM is a construct for designing a cognitive defense rather than the objective of the defense itself.
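One way to realize such a test-time gate is sketched below: each agent keeps a running belief about every sender's cooperativeness, updated from observed behavior, and only accepts messages once that belief clears a threshold. The exponential-moving-average update, the evidence signal, and the threshold value are illustrative assumptions, not the paper's exact ToM formulation.

```python
class ToMGate:
    """Test-time message filter based on a local cooperativeness belief."""

    def __init__(self, threshold=0.5, lr=0.1):
        self.belief = {}          # sender id -> estimated cooperativeness
        self.threshold = threshold
        self.lr = lr

    def observe(self, sender, cooperative_evidence):
        # cooperative_evidence in [0, 1], e.g. agreement between the
        # sender's observed actions and a locally simulated cooperative
        # policy. Update the belief toward the new evidence.
        b = self.belief.get(sender, 0.5)
        self.belief[sender] = b + self.lr * (cooperative_evidence - b)

    def accept(self, sender):
        # Accept messages only from senders believed to be cooperative;
        # unknown senders start at the neutral prior of 0.5.
        return self.belief.get(sender, 0.5) >= self.threshold
```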


Leveraging Contextual Relatedness to Identify Suicide Documentation in Clinical Notes through Zero Shot Learning

Workman, Terri Elizabeth, Goulet, Joseph L., Brandt, Cynthia A., Warren, Allison R., Eleazer, Jacob, Skanderson, Melissa, Lindemann, Luke, Blosnich, John R., O'Leary, John, Zeng-Treitler, Qing

arXiv.org Artificial Intelligence

Identifying suicidality, including suicidal ideation, attempts, and risk factors, in clinical notes within electronic health record data is difficult. A major difficulty is the lack of training samples, given the small number of true positive instances among the increasingly large number of patients being screened. This paper describes a novel methodology that identifies suicidality in clinical notes by addressing this data sparsity issue through zero-shot learning. U.S. Veterans Affairs clinical notes served as data. Training dataset labels were determined using diagnostic codes for suicide attempt and self-harm. A base string associated with the target label of suicidality was used to provide auxiliary information by narrowing the positive training cases to those containing the base string. A deep neural network was trained by mapping the training documents' contents to a semantic space. For comparison, we trained another deep neural network using the identical training dataset labels and bag-of-words features. The zero-shot learning model outperformed the baseline model in terms of AUC, sensitivity, specificity, and positive predictive value at multiple probability thresholds. Applying a 0.90 probability threshold, the methodology identified notes documenting suicidality that were not associated with a relevant ICD-10-CM code, with 94 percent accuracy. This new method can effectively identify suicidality without requiring manual annotation.
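The weak-labeling step can be summarized in a few lines. The sketch below, under stated assumptions, marks a note as a positive training case only when a qualifying diagnostic code is present and the note contains the base string; the specific base string shown is a placeholder for the one tied to the target label in the paper.

```python
def make_training_labels(notes, has_code, base_string="suicid"):
    """Assign weak training labels: a note is a positive example only
    if the patient has a qualifying diagnostic code AND the note text
    contains the base string associated with the target label."""
    labels = []
    for note, coded in zip(notes, has_code):
        positive = coded and base_string in note.lower()
        labels.append(1 if positive else 0)
    return labels
```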


A Novel Approach For Generating Customizable Light Field Datasets for Machine Learning

Huang, Julia, Smith, Toure, Patro, Aloukika, Chhabra, Vidhi

arXiv.org Artificial Intelligence

To train deep learning models, which often outperform traditional approaches, large datasets of a specified medium, e.g., images, are used in numerous areas. However, for light-field-specific machine learning tasks, such datasets are lacking. Therefore, we create our own light field datasets, which have great potential for a variety of applications due to the abundance of information in light fields compared to singular images. Using the Unity engine and C#, we develop a novel approach for generating large, scalable, and reproducible light field datasets based on customizable hardware configurations to accelerate light field deep learning research.
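Although the generator itself is built in Unity and C#, the resulting data structure is easy to picture. The Python sketch below assembles renders from a camera grid into the common two-plane light field parameterization L(u, v, s, t) (plus color channels); this array layout is a conventional choice, not necessarily the one used in the paper.

```python
import numpy as np

def assemble_light_field(views, grid_rows, grid_cols):
    """Stack renders from a (grid_rows x grid_cols) camera array into a
    light field: angular dims (u, v) index the camera position in the
    grid, spatial dims (s, t) index pixels within each view.

    views: list of (H, W, 3) arrays in row-major camera order.
    """
    h, w, c = views[0].shape
    lf = np.zeros((grid_rows, grid_cols, h, w, c), dtype=views[0].dtype)
    for idx, img in enumerate(views):
        u, v = divmod(idx, grid_cols)
        lf[u, v] = img
    return lf
```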


Adversarial Stimuli: Attacking Brain-Computer Interfaces via Perturbed Sensory Events

Upadhayay, Bibek, Behzadan, Vahid

arXiv.org Artificial Intelligence

Machine learning models are known to be vulnerable to adversarial perturbations in the input domain, causing incorrect predictions. Inspired by this phenomenon, we explore the feasibility of manipulating EEG-based Motor Imagery (MI) Brain-Computer Interfaces (BCIs) via perturbations in sensory stimuli. Similar to adversarial examples, these adversarial stimuli aim to exploit the limitations of the integrated brain-sensor-processing components of the BCI system in handling shifts in participants' responses to changes in sensory stimuli. This paper proposes adversarial stimuli as an attack vector against BCIs, and reports the findings of preliminary experiments on the impact of visual adversarial stimuli on the integrity of EEG-based MI BCIs. Our findings suggest that minor adversarial stimuli can significantly degrade the performance of MI BCIs across all participants (p=0.0003). Additionally, our results indicate that such attacks are more effective under conditions of induced stress.
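For a significance level like the one reported, a paired nonparametric test over per-participant accuracies is one natural analysis. The sketch below uses a one-sided Wilcoxon signed-rank test as an assumed stand-in, since the abstract does not name the test actually used.

```python
from scipy.stats import wilcoxon

def stimulus_effect(acc_baseline, acc_adversarial):
    """Paired test of whether per-participant MI-BCI classification
    accuracy drops under adversarial stimuli. Both arguments are
    equal-length sequences of per-participant accuracies."""
    stat, p = wilcoxon(acc_baseline, acc_adversarial,
                       alternative="greater")
    return stat, p
```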